# Self-Supervised Learning
## Dinov2 Base ONNX
The ONNX-format version of the facebook/dinov2-base model, suitable for computer vision tasks.
Publisher: onnx-community · Downloads: 19 · Likes: 0
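
A minimal sketch of running an exported DINOv2 ONNX graph with `onnxruntime`. The local file path, the input tensor name, the 224x224 input size, and the use of random data in place of a normalized image are all assumptions for illustration, not details taken from this listing.

```python
# Minimal sketch: inference on an exported DINOv2 ONNX graph with onnxruntime.
# Assumptions: the model file is saved locally as "dinov2-base/model.onnx" and
# the graph takes a single 1x3x224x224 float32 image input.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("dinov2-base/model.onnx")
input_name = session.get_inputs()[0].name  # often "pixel_values" for Hugging Face exports

# Dummy batch of one 224x224 RGB image; real use would pass a normalized image.
pixel_values = np.random.randn(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: pixel_values})
features = outputs[0]  # typically the last hidden state, e.g. shape (1, num_tokens, 768)
print(features.shape)
```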
## AV HuBERT MuAViC Ru
AV-HuBERT is an audio-visual speech recognition model trained on the MuAViC multilingual audio-visual corpus, combining audio and visual modalities for robust performance.
Publisher: nguyenvulebinh · Tags: Audio-to-Text, Transformers · Downloads: 91 · Likes: 1
## Dna2vec
A DNA sequence embedding model based on the Transformer architecture, supporting sequence alignment and genomics applications.
Publisher: roychowdhuryresearch · License: MIT · Tags: Molecular Model, Transformers · Downloads: 557 · Likes: 1
## TITAN
TITAN is a multimodal whole-slide foundation model pre-trained through visual self-supervised learning and vision-language alignment for pathology image analysis.
Publisher: MahmoodLab · Tags: Multimodal Fusion, Safetensors, English · Downloads: 213.39k · Likes: 37
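
A minimal loading sketch, assuming the weights are hosted on the Hugging Face Hub under the repo id `MahmoodLab/TITAN` and that the repository ships its own modeling code (hence `trust_remote_code`). Access may be gated, and the slide-level encoding methods are repository-specific, so they are deliberately not guessed here; the model card is the authoritative reference.

```python
# Minimal sketch of loading a pathology foundation model that ships custom code.
# Assumptions: repo id "MahmoodLab/TITAN", access already granted, and any
# required huggingface_hub login already performed.
from transformers import AutoModel

model = AutoModel.from_pretrained("MahmoodLab/TITAN", trust_remote_code=True)
model.eval()
# Slide-level feature encoding uses repository-specific methods documented in
# the model card; they are omitted here to avoid inventing an API.
```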
## AV HuBERT
A multilingual audio-visual speech recognition model based on the MuAViC dataset, combining audio and visual modalities for robust performance.
Publisher: nguyenvulebinh · Tags: Audio-to-Text, Transformers · Downloads: 683 · Likes: 3
## Dinov2 Large Patch14 Reg4
DINOv2 is a vision-transformer-based image feature extraction model; this variant strengthens feature extraction by adding register tokens.
Publisher: refiners · License: Apache-2.0 · Downloads: 15 · Likes: 0
## Dinov2 Large
DINOv2 is a vision model released by Facebook Research that extracts general-purpose visual features through self-supervised learning, making it suitable for a wide range of downstream tasks.
Publisher: Xenova · Downloads: 82 · Likes: 1
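
A minimal sketch of DINOv2 feature extraction with the Python `transformers` API. The original `facebook/dinov2-large` checkpoint is used here for illustration, since the Xenova repo in this listing is an ONNX conversion aimed at transformers.js / onnxruntime environments; the image path is a placeholder.

```python
# Minimal sketch: general-purpose image features from DINOv2 via transformers.
# Assumptions: the upstream checkpoint "facebook/dinov2-large" is used and
# "example.jpg" is any local image.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

processor = AutoImageProcessor.from_pretrained("facebook/dinov2-large")
model = AutoModel.from_pretrained("facebook/dinov2-large")

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# The CLS token embedding serves as a general-purpose image descriptor.
cls_embedding = outputs.last_hidden_state[:, 0]
print(cls_embedding.shape)  # (1, 1024) for the large variant
```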
## Electra Small Generator
ELECTRA is an efficient text encoder that achieves strong performance at lower compute by relying on discriminative rather than generative pretraining.
Publisher: google · License: Apache-2.0 · Tags: Large Language Model, English · Downloads: 11.07k · Likes: 12
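
A minimal sketch, assuming the Hub repo id is `google/electra-small-generator`: the generator half of the ELECTRA setup is a small masked language model, so it can be exercised with a fill-mask pipeline even though the headline ELECTRA result concerns the discriminator.

```python
# Minimal sketch: probing the ELECTRA small generator as a masked language model.
# Assumption: the repo id is "google/electra-small-generator".
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="google/electra-small-generator")

# Rank candidate fillers for the masked token.
for prediction in fill_mask("Paris is the [MASK] of France."):
    print(prediction["token_str"], round(prediction["score"], 3))
```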
## Wav2vec2 FR 3K Base
A wav2vec2 base model trained on 2.9K hours of French speech, supporting spontaneous, read, and broadcast speech.
Publisher: LeBenchmark · License: Apache-2.0 · Tags: Speech Recognition, Transformers, French · Downloads: 31 · Likes: 0
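
A minimal sketch of using a pretrained wav2vec2 encoder (no CTC head) to extract frame-level speech representations. The repo id `LeBenchmark/wav2vec2-FR-3K-base`, the 16 kHz mono input, and the use of silence as stand-in audio are assumptions for illustration.

```python
# Minimal sketch: frame-level representations from a pretrained wav2vec2 encoder.
# Assumptions: repo id "LeBenchmark/wav2vec2-FR-3K-base" and 16 kHz mono input.
import torch
from transformers import Wav2Vec2Model

repo_id = "LeBenchmark/wav2vec2-FR-3K-base"  # assumed repo id
model = Wav2Vec2Model.from_pretrained(repo_id)
model.eval()

# One second of 16 kHz audio (silence) as a stand-in for real French speech;
# real use should pass appropriately normalized waveforms.
input_values = torch.zeros(1, 16000)

with torch.no_grad():
    hidden_states = model(input_values).last_hidden_state

print(hidden_states.shape)  # roughly (1, 49, 768) for the base variant
```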